SCIENTIFIC 

REPpRTS 



X 




SUBJECT AREAS: 

INFORMATION THEORY 
AND COMPUTATION 

QUANTUM PHYSICS 

MATHEMATICS 

GENERAL PHYSICS 



Received 
7 December 2011 

Accepted 
20 January 2012 

Published 
10 February 201 2 



Correspondence and 
requests for materials 
should be addressed to 
X.H.P. (xhpeng@ustc. 
edu.cn) orJ.F.D (dif@ 
ustc.edu.cn) 



An Efficient Exact Quantum Algorithm for 
the Integer Square-free Decomposition 
Problem 



Jun Li 1 , Xinhua Peng 1 , Jiangfeng Du 1 & Dieter Suter 2 



1 Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and 
Technology of China, Hefei, Anhui 230026, People's Republic of China, 2 Fakultdt Physik, Technische Universitdt Dortmund, 44221 
Dortmund, Germany. 

Quantum computers are known to be qualitatively more powerful than classical computers, but so far only a 
small number of different algorithms have been discovered that actually use this potential. It would 
therefore be highly desirable to develop other types of quantum algorithms that widen the range of possible 
applications. Here we propose an efficient and exact quantum algorithm for finding the square-free part of a 
large integer - a problem for which no efficient classical algorithm exists. The algorithm relies on properties 
of Gauss sums and uses the quantum Fourier transform. We give an explicit quantum network for the 
algorithm. Our algorithm introduces new concepts and methods that have not been used in quantum 
information processing so far and may be applicable to a wider class of problems. 

A fundamental tenet of classical computer science is based on the Church-Turing thesis, which asserts that 
any practically realizable computational device can be simulated by a universal computer known as the 
Turing machine 1 . However, this hypothesis implicitly relies on the laws of classical physics 2 and was 
challenged by Feynman 3 and others who suggested that computational devices behaving according to quantum 
mechanics could be qualitatively more powerful than classical computers. A first proof of this conjecture was 
given in 1993 by Bernstein and Vazirani 4 . They showed that a quantum mechanical Turing machine is capable of 
simulating other quantum mechanical systems in polynomial time, an exponential improvement in computa- 
tional power over the classical Turing machine. Their proof did not give an actual fast quantum algorithm, but in 
the following year, Peter Shor came up with his famous factoring algorithm 5 , which solves the integer factoriza- 
tion problem in polynomial time, exponentially faster than any known classical algorithms. The essential part of 
this algorithm is a solution of the order-finding problem, which can be formulated as a hidden subgroup problem 
(HSP) 6 . A hidden subgroup problem is like to find out the period of a given periodic function. The structure of the 
function's periodicity may be so complicated that it can not be easily determined by classical means. The 
importance of the HSP is that various instances (eg. Pell's equation, the principal ideal problem, unit group 
computing) and variants like the hidden shift problem and hidden nonlinear structures encompass most of the 
quantum algorithms found so far that are exponentially faster than their classical counterparts 7 . This relatively 
narrow range of existing fast quantum algorithms shows the urgent need for different types of quantum algo- 
rithms that will make other classes of problems accessible to efficient solutions. 

Here we describe such a quantum algorithm that does not fall into the framework of HSP. It solves two number- 
theoretical problems in polynomial time, i.e., testing the square-freeness and computing the square-free part of a 
given integer. Compared to the known classical algorithms, this provides an exponential increase in computa- 
tional efficiency. While these problems are related to the factorization problem solved by Shor, our algorithm 
relies on a different approach. Furthermore, while Shor's algorithm is probabilistic, the algorithm presented here 
is exact and its computational complexity is lower. 

We consider a positive integer AT with its unique prime factor decomposition N =p^p2 2 ' "P^ (P« are P rimes )- 
N is called square-free if no prime factor occurs more than once, i.e., for all i (i = 1, 2, k), a z = 0 or 1. An 
arbitrary positive integer can always be written as 

N = r-s 2 , (1) 

where r is square-free, and this square-free decomposition is unique. Thus, usually r and s 2 are called the square- 
free part and the square part of N, respectively. The square-freeness testing problem corresponds to determining 
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whether 5=1. An additional problem consists in finding the square- 
free part r of N. These problems were listed as two unsolved open 
problems 8 , since no efficient algorithm is currently known for either 
of them. Actually they may be no easier than the general problem of 
integer factorization 9 . It was found 10 that the factorization of N = pq 2 
(p, q both prime) is almost as hard as the factorization ofN = pq. This 
fact has been used in a proposed digital signature scheme called TSH- 
ESIGN, which is more efficient than any representative signature 
scheme such as elliptic curve and RSA based signature 10 . A concrete 
estimation of the lower bound of classical Boolean circuit complex- 
ity 11 showed that testing square-free numbers by unbounded fan-in 
circuits of bounded depth requires a superpolynomial size. On the 
other hand, the square-free part problem appears to be a represent- 
ative of a larger class of computational problems. As an example, 
computing the ring of integers of an algebraic number field, one of 
the main tasks of computational algebraic number theory, reduces to 
it in deterministic polynomial time 1213 . 

We now describe an efficient, exact quantum algorithm that solves 
both problems. It uses the Gauss sum 32 " 34 , an important object which 
has been extensively investigated in mathematics (see supplementary 
information). Throughout this paper, we will assume that AT is an odd 
integer (the case of even numbers can be trivially reduced to this 
case). The Gauss sum is defined as 



G(a, X )= J2*n( 



2niam/N 



where a is an integer and the function x N {m) represents the Jacobi 
symbol of m relative to AT 14 . 

The evaluation of the Gauss sum is closely related to the square- 
freeness of N. Let notation (x, y) indicate the greatest common 
divider (GCD) of x and y. If N is square-free, then we have 

G(a,x)=0, V(a,N)>l. (2) 



Conversely 15 , if N is not square-free 

G(a, %)=0, V(fl,N) = l. 



(3) 



This remarkable fact suggests a dichotomy criterion for testing 
square-freeness, it represents the cornerstone of our algorithm. 

Results 

We present the algorithm first for the relatively simple case where N 
= pq 2 (p, q both prime) and subsequently generalize it. The algorithm 
consists of two parts, as illustrated in Fig. 1. In the first part, we 
generate the state 

W = ^= E x(m 9 N)\m) 9 (4) 

where the normalization coefficient (p(N) represents Euler's function 
(number of integers smaller than AT that are coprime to N). van Dam 
and Seroussi 16 proposed a general method for preparing such a su- 
perposition state. They gave the example of computing the Legendre 
symbol, which is a special case of the Jacobi symbol, which reduces to 
the Legendre symbol when N is prime. They also computed the 



Jacobi symbol for the case when the factorization of N is known. 
In our case, the factors of N are not known. Thus we would adopt 
another technique for computing the Jaocbi symbol 17 , which we 
discuss in the following. The second part of the algorithm is to apply 
the quantum Fourier transform (QFT) to \</>). The resulting state 
encodes the factors p and q of N, which can be retrieved by perform- 
ing measurements on the qubits. 

Now we discuss the details of the algorithm. Set n = [log N] , the 
smallest integer for which 2" > N. We need two main registers A and 
B, both initialized to |0)®". Additional registers needed for storing 
auxiliary variables and constants are not represented explicitly for 
simplicity. The first part starts with a state uniformly superposed 
from 1 to N — 1, which is prepared just by an N — 1 dimensional 
Fourier transform on register A and a subsequent addition with 1 



| 0 )f | 0 )f - 



1 MM — 1 
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Note that this Fourier transform is of order N — 1 , and it was known 18 
that the quantum fast Fourier transform can be made exact for 
arbitrary orders. Next we compute the greatest common divisor of 
m and N into register B 

U > ■■^^ N f2\ m ^\0)f n ^^^JlH A \(M,N)) B . (5) 

1 m=l 



1 m=l 



Classically, the GCD problem can be efficiently solved by the classical 
Euclidean algorithm in quadratic polynomial time. In order not to 
involve the complicated division arithmetics of the Euclidean algo- 
rithm, we prefer to adopt the extended Euclidean algorithm 19 . The 
extended Euclidean algorithm can be directly generalized to a 
quantum GCD algorithm that operates on a superposition state with 
the same computational complexity (see supplementary information 
for the quantum network construction). 

We then take a measurement M x of register B. If the result is not 1, 
then it must bep or q or q 2 and clearly the algorithm already succeeds. 
However, it's highly possible that we would not obtain such results, 
and the algorithm continues. This is because the probability of 
obtaining (m, N) = 1 is (p(N)/(N - 1) = (p - l)(q - 1)1 {pq - 1), 
which asymptotically approaches 1 for sufficiently large p and q. If 
Mi results in 1, we get 

1 



i 



The next step is to obtain the state \<j>) as given in (4), i.e., we do the 
following unitary operation on register A 

. 1 |m)->— x(ra)lm), 



(6) 



where /(m) are 1 or —1 as by the definition of Jacobi symbol, and 
register B is omitted. The key part of U 2 is to compute the Jacobi 
symbol #(m) for all (m, N) = 1. Classically, the Jacobi symbol can 
be efficiently solved by many algorithms. There exists 20 a binary 
algorithm which has the advantage of lower complexity and easier 




Figure 1 | Outline of quantum circuit for computing the square-free part for N = pq 2 . The procedure denoted as Q in the text consists of two main parts. 
In the first part, we generate the state I (/)}; in the second part, we apply the quantum Fourier transform (QFT) to it. Single lines represent qubits, and boxes 
represent operations. Time runs from left to right. The transformation U x and U 2 are defined by Eq. (5) and Eq. (6). The meters M x and M 2 represent the 
measurements. The double lines coming from M l carry the classical bits, here the algorithm continues only if register B collapses to 1. 
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implementation on a binary computer. The binary algorithm can be 
seen as a variant of the extended Euclidean algorithm, and hence 
can also be extended to a quantum algorithm (see supplementary 
information for the quantum network construction). 

As the last step of the algorithm, we take a Fourier transform on | (j)) 
and obtain 



E*)ii' 



,2nimk/N 



(?) 



According to the properties (2) and (3) of the Gauss sum, all ampli- 
tudes vanish unless k shares a nontrivial common factor with N. If we 
perform a measurement M 2 on the register, it always collapses to a 
state |/c 0 ), whose GCD with AT is a non-trivial factor p or q of N. It 
therefore yields the complete decomposition of N. 

We now determine the computational complexity of this algo- 
rithm. All the transformations involved in the algorithm, including 
the extended Euclidean algorithm for GCD and Jacobi symbol and 
QFT, require 0((log N) 2 ) elementary gate operations 6 . Thus this 
algorithm has only a polynomial-time complexity. 

For a general AT with possibly many distinct prime factors (square- 
freeness of N is unknown), the procedure outlined above may not 
work. However, it can be generalized to include this case, and the 
generalized algorithm remains simple and efficient. We refer to the 
algorithm described above as Q and discuss now the generalized 
algorithm, which includes Q as a subroutine. 



As we discussed, the algorithm Q includes two measurements, M x 
and M 2 . With a certain probability, M Y yields a nontrivial factor of N. 
If this does not happen, we proceed to the second measurement M 2 . 
Two possibilities will occur at M 2 due to the dichotomy property of 
Gauss sum (2, 3) : we obtain (i) a non-trivial factor of AT if AT is not 
square-free, or (it) a result coprime to AT, which signifies that N is 
definitely square-free. As a result, no matter whether Q ends at M Y or 
M 2 , it either yields a non-trivial factor (say c) of N or determines that 
N is square-free. In the latter case, we have succeeded already, hence 
the algorithm finishes. In the former case, if the two parts c and N/c 
share a common factor d = (c, N/c), we know that d 2 is a factor of the 
square part s 2 of N. We thus can split the problem of finding the 
square-free part r of AT into two smaller problems: finding the square- 
free parts of c/d and N/(cd). From the solutions of these subproblems, 
we find the corresponding parts of N as 



r = R(N)=R(c/d)-R(N/cd) 
s 2 = S(N) = S(c/d) -S(N/cd) -d 2 . 



(8) 



Here, R( ■ ) and S( ' ) represent the square-free part and the square part 
of their argument, respectively. Clearly, this procedure can be iter- 
ated until all branches have determined that the arguments are 
square-free. Figure 2 illustrates this recursive procedure. 

The execution time of the extended algorithm reaches a maximum 
when each execution of Q yields just one factor, but clearly, the 
number of repetitions is still bounded by 0(log N). Each execution 
of the subroutine Q requires at most 0((log AT) 2 ) steps. The worst- 
case complexity of the extended algorithm is therefore 0((log N) 3 ). 
Actually, we have a better estimation of how long it takes untill the 



a 



// (x,N)=c>l 





Figure 2 | Schematic flow chart of the recursive quantum algorithm for computing the squrare-free part of an arbitrary odd integer N. (a) Possible 
outcomes of applying the algorithm Q on an arbitrary odd integer N: either return a factor c or else ensure that AT is a square-free number with the square- 
free part r = N. If a factor c > 1 is returned, and AT is tested to be not square-free, then the problem is converted to two smaller sub-problems for c/d and N/ 
cd where d is the greatest common divisor of c and A//c.This serves as the subroutine of the recursive quantum algorithm, (b) Recursive algorithm for a 
general AT. Different colors are used to designate two different outcomes after applying the subroutine Q. The red color denotes that number is square-free, 
then this branch terminates. The blue color denotes the other outcome; in this case, the algorithm proceeds to the next step of recursion. The Q operation 
needs to be performed at most log AT times to solve this problem. 
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algorithm succeeds. This is by virtue of the observation that M 2 yields 
the square part with high probability, and calculations show that the 
algorithm will finish with high probability in just 0((log N) 2 (\og log 
N) 2 ) (see methods). 

Discussion 

Classically, finding the square-free part of an integer is believed to be 
very difficult. It was argued 10 that the best method known for its 
solution is through factorization. The fastest classical algorithm for 
factorization would be the number field sieve 21 , which requires 
0(exp(c(log AF) 1/3 (log log AT) 273 )) steps. Thus the quantum algorithm 
presented here offers an exponential speed-up over the classical 
algorithm. A feasible alternative to our algorithm would be to use 
Shor's algorithm to obtain the complete decomposition of N also in 
polynomial time. Application of Shor's algorithm yields, with some 
probability, two divisors of AT in time 0((log N) 2 log log AT log log log 
AT) 6 . Like our algorithm, Shor's algorithm would thus also be applied 
repetitively, with the number of iterations bounded by 0(log AT). The 
overall computational complexity using Shor's algorithm would be 
0((log AO 3 log log AT log log log AO- We further remark that achieving 
complete factorization through Shor's algorithm raises more subtle- 
ness. A necessary part of complete factorization is primality test, 
however Shor's algorithm fails to recognize a prime number 
with probability 1, this of course increases algorithmic complex- 
ity 30 ' 31 . Figure 3 compares the computational costs of the three algo- 
rithms described above, clearly showing the increase in 
computational efficiency by the algorithm presented here. 

Our algorithm relies on the mathematical properties of the Gauss 
sums. The possibility of using the periodicity properties of Gauss 
sums for factorization was suggested earlier 22,23 and the feasibility 
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Figure 3 | Comparison between the computational costs of the three 
algorithms discussed in the text. Both quantum algorithms offer 
exponential speedup over the classical methods. For hundreds of digits, 
our algorithm is almost two orders of magnitude faster than the Shor's 
algorithm. 



of this approach was demonstrated in various physical systems 
including nuclear magnetic resoance 24 " 26 , cold atoms 27 and super- 
conducting circuits 28 . However, these schemes did not use the 
specific properties of quantum mechanical systems. They can be 
implemented in classical as well as in quantum systems and the 
scaling properties are therefore not superior to other classical algo- 
rithms 32 ' 33 . In contrast, the algorithm that we have described in this 
paper relies on quantum superpositions and is both efficient and 
exact in solving the square-free part computation problem, even 
demonstrates advantages over Shor's approach. In Shor's algorithm, 
the major cost comes from the modular exponentiation operation, 
while Gauss sums can be generated through 0((log AT) 2 ) modular 
square operation. In our algorithm, we have noticed that Gauss sum 
evaluations are closely related to the factorization of N. While we 
have not found such an algorithm so far, it may thus be possible to 
develop a quantum algorithm on the basis of Gauss sums that solves 
integer factorization. 

Methods 

Realization of Ui . Ui is to compute the greatest common divisor of m and N. 
Classically, the GCD problem can be efficiently solved based on the famous extended 
Euclidean algorithm. There is a variant of this algorithm, called the binary GCD 
algorithm, which can be more conveniently performed on a binary computer. We 
adopt this method here, and succeed in finding a quantum network that performs the 
binary GCD algorithm on a quantum superposition state (see supplementary 
information for details). 



Realization of U 2 . In Fig. 1, the operation U 2 is realized through the following steps 

initial state 

apply the operation of Jacobi symbol 
computation 



(0). £ \m) A \l) B 

(m,N) = l 
(m,N) = l 



(2) . £ e m<a ^ 1 ^ 2 \m) A \x(m)) B apply conditional phase shifts 

(m,N) = l 

(3) . £ %(m) | m) A 1 1) B apply step (1) in reverse order 

(m,N) = 1 

(4) . = \<P) A W B , 

where we use the phase kickback trick 16 and the identity e i7l (x(m)-i)n _ ^( m ) The 
computation of Jacobi symbol can be implemented by binary Jaocbi algorithm (see 
supplementary information for details). 

Complexity estimation of the algorithm. In the following, we discuss the algorithm 
complexity for a general case N. To do this, we slightly change the algorithm presented 
in the text. Our analysis is based on the finding: if at the measurement M 2 we obtain a 
result whose common divisor with N is a square number, then the common divisor 
must be the square part s 2 of N; and the probability of this case is larger than ((p(N)/N) 2 
(see supplementary information for proofs). Hence the algorithm can be altered in the 
way that if any branch of the algorithm proceeds to M 2 and results in a square 
number, then that branch terminates. 

Denote P( * ) as the probability of obtaining the square part of its argument by 
application of Q. Letp k denotes the probability that the algorithm succeeds at the A;-th 
iteration step. Obviously 



where P(N) > ((p(N)/N) 2 . If Q does not succeed at the first step, and suppose we have 
obtained c and N/c and d = (c,N/c), then 



p 2 = (l-P(N))pQp fI " 



>(1-P(N)) 



>(l-P(N)) 



d) \cd 

(p{c/d) (p{N/cd) 
~c/d N/cd 



N J 
= (1-P(N))P(N). 

Here, the second inequality is valid because of a basic property of the Euler function 
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(p(mn) = (p(m)(p(n) 



gcd(m, n) 
(p(gcd(m, n)) ' 



Analogously, we will have 



p 3 >(l-P(N)) 2 P(N) 



p k >(l-P(N)) k X P(N). 



Therefore, after k steps, the probability that the algorithm still does not succeed is 
Q < 1 = g (1 -P(N)) i - 1 P(N) = (1 -P(N)) k < ^1 - 2 j . 

According to the inequality (Theorem 8.8. 7 29 ) 
4>(N) ^ 1 



N e>'loglogN +I ^' 

where y = 0.5772. . . is the Euler-Mascheroni constant, and for a large N, <p(N)/N > II 
(2 log log IV). 



So we have 



Q< 1 



2 log log N 



(9) 



When /c = 0((log log N) 2 ), Q —> 0, this means, the algorithm doesn't need to go for 
k = 0(log N) times, but would finish with high probability in 0((log log N) 2 ) steps. 
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